if we think about it, there are no data that can be non-spatial or non-temporal
in many cases, the spatial or temporal references are not essential
think: a brain image of a person: time matters, but mostly referenced with respect to the age of the person; the spatial location of the MRI scanner does not
but: the ID of the patient does!
and: the time of the scan matters too!
we will “pigeon-hole” (classify) phenomena into: fields, objects, aggregations
Fields
many processes can be represented by fields, meaning they could be measured everywhere
mathematical models, such as the Navier–Stokes equations (from Wikipedia):
\[\rho \left(\frac{\partial v}{\partial t} + v \cdot \nabla v \right) = - \nabla p + \nabla \cdot T + f\]
What is a mathematical model?
A mathematical model is an abstract model that uses mathematical language to describe the behaviour of a system.
a representation of the essential aspects of an existing system (or a system to be constructed) which presents knowledge of that system in usable form (P. Eykhoff, 1974, System Identification, J. Wiley, London.)
In the natural sciences, a model is always an approximation, a simplification of reality. If degree of approximation meets the required accuracy, the model is useful, or valid (of value). A validated model does not imply that the model is “true”; more than one model can be valid at the same time.
Time series models
we will first look into time series models, because they are
simple
easy to write down
well understood
time series models are roughly divided into
time domain models, which look at correlations and memory, and
frequency domain models, which focus on periodicities
Spatial equivalents are mostly found in the time domain models, although the frequency domain has spatial equivalents as well (e.g. wavelets).
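As a hedged illustration of the two views (not from the original slides): base R offers `acf()` for the time domain and `spectrum()` for the frequency domain. Applied to an assumed noisy periodic signal, the first shows correlations oscillating with lag, the second a periodogram peak at the signal's frequency:

```r
# Sketch: periodic signal (period 25, so frequency 1/25 = 0.04) plus white noise
set.seed(1)  # arbitrary seed, for reproducibility
y <- sin(2 * pi * (1:500) / 25) + rnorm(500, sd = 0.5)
acf(y, plot = FALSE)            # time domain: correlations oscillate with lag
s <- spectrum(y, plot = FALSE)  # frequency domain: raw periodogram
s$freq[which.max(s$spec)]       # peak near 1/25 = 0.04
```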
Some data
Consider the following process (\(\Delta t\) = 1 min):
Code
load("./data/meteo.RData") # should be available in the current working directory
plot(T.outside ~ date, meteo, type = 'l',
     ylab = parse(text = "Temperature ({}*degree* C)"),
     xlab = "date, 2007")
title("Outside temperature, Hauteville, FR")
Some questions
how can we describe this process in statistical terms?
how can we model this process?
(how) can we predict future observations?
White noise
Perhaps the simplest time series model is white noise with mean \(m\):
\[y_t = m + e_t, \ \ e_t \sim N(0,\sigma^2)\]
\(N(0,\sigma^2)\) denoting the normal distribution with mean 0 and variance \(\sigma^2\), and \(\sim\) meaning distributed as or coming from.
\(t\) is the index \(t=1,2,...,n\) of the observation, and refers to specific times, which, when not otherwise specified, are at regular intervals.
A white noise process is completely without memory: each observation is independent from its past or future.
White noise
Plotting independent, standard normal values against their index (the default when plotting a vector in R) shows what a white noise time series looks like:
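A minimal sketch of such a plot (seed and length are arbitrary choices):

```r
# White noise: independent draws from N(0, 1), plotted against their index
set.seed(1)  # arbitrary seed, for reproducibility
e <- rnorm(500)
plot(e, type = 'l', main = "White noise", xlab = "index", ylab = "y")
```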
Autocorrelation
Autocorrelation (or lagged correlation) is the correlation between \(y_i\) and \(y_{i+h}\), as a function of the lag \(h\): \[
r(h) = \frac{\sum_{i=1}^{n-h}(y_i-\bar{y})(y_{i+h}-\bar{y})}{\sum_{i=1}^n (y_i-\bar{y})^2}
\] with \(\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i\)
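This formula can be computed directly; a sketch checking it against R's `acf()`, which uses the same definition (overall mean, denominator summed over all \(n\) terms):

```r
set.seed(1)  # arbitrary seed
y <- rnorm(200)
n <- length(y); ybar <- mean(y)
# r(h) exactly as in the formula above
r <- function(h) sum((y[1:(n - h)] - ybar) * (y[(1 + h):n] - ybar)) /
                 sum((y - ybar)^2)
a <- acf(y, plot = FALSE)
all.equal(r(1), as.numeric(a$acf[2]))  # a$acf[1] is lag 0
```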
Autocorrelation of white noise
We can look at the auto-correlation function of a white noise process, and find it is uncorrelated for any lag larger than 0:
Code
plot(acf(rnorm(10000), plot=FALSE), main="ACF of white noise")
Random walk
The next simple model to look at is the random walk, where at each time step a change is made according to a white noise process: \[y_t = y_{t-1} + e_t\]
Such a process has memory, and long-range correlation. If we take the first-order differences, \[y_t - y_{t-1} = e_t\] we obtain the white noise process.
Further, the variance of the process increases with time, \(\mbox{Var}(y_t) = t\sigma^2\) (i.e., it is non-stationary)
Example random walk:
We can compute it as the cumulative sum of standard normal deviates: \(y_n = \sum_{i=1}^n e_i\):
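A sketch (with an assumed seed): cumulative summing white noise gives a random walk, and differencing recovers the noise:

```r
set.seed(1)  # arbitrary seed
e <- rnorm(1000)           # white noise
y <- cumsum(e)             # random walk: y_t = y_{t-1} + e_t
plot(y, type = 'l', main = "Random walk")
all.equal(diff(y), e[-1])  # first-order differences give back the noise
```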
Let \(e_t\) be a white noise process. A moving average process of order \(q\) is generated by \[y_t = \beta_0 e_t + \beta_1 e_{t-1} + ... + \beta_q e_{t-q}\]
Note that the \(\beta_j\) are weights, and could be \(\frac{1}{q+1}\) to obtain an unweighted average. Moving averaging smoothes the white noise series \(e_t\).
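A sketch using `stats::filter()` with the unweighted choice \(\beta_j = \frac{1}{q+1}\) (here \(q=4\)); the first \(q\) values are `NA` because the filter needs \(q\) past values:

```r
set.seed(1)  # arbitrary seed
e <- rnorm(500)
q <- 4
# y_t = mean of e_{t-q}, ..., e_t  (one-sided moving average)
y <- stats::filter(e, rep(1 / (q + 1), q + 1), sides = 1)
c(y[10], mean(e[6:10]))  # the two agree
```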
Moving Average: MA(1), MA(q)
Moving average over monthly CO2 measurements on Mauna Loa:
Moving Average: MA(1), MA(q)
Moving averages, over a white noise process (MA(5) in red, MA(20) in blue):
Moving Average: MA(1), MA(q)
Wider moving average filters give new processes with
less variation
stronger correlation, over larger lags
Autoregressive process: AR(1)
An auto-regressive (1) model, or AR(1) model is generated by \[y_t = \phi_1 y_{t-1}+e_t\] and is sometimes called a Markov process. Given knowledge of \(y_{t-1}\), observations further back carry no information; more formally: \[\Pr(y_t|y_{t-1},y_{t-2},...,y_{t-q}) = \Pr(y_t|y_{t-1})\]
\(\phi_1 = 1\) gives random walk, \(\phi_1=0\) gives white noise.
AR(1) processes have correlations beyond lag 1
AR(1) processes have non-significant partial autocorrelations beyond lag 1
Although \(y_t\) is generated from \(y_{t-1}\) alone, observations further back are still correlated with \(y_t\): the information propagates through the intermediate values
AR(p) have autocorrelations beyond lag p
AR(p) have “zero” partial autocorrelations beyond lag p
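These properties can be checked by simulation; a sketch using `arima.sim()` with \(\phi_1 = 0.85\) (the value also used below): the ACF decays roughly as \(\phi_1^h\), while the PACF is near zero beyond lag 1:

```r
set.seed(1)  # arbitrary seed
y <- arima.sim(model = list(ar = 0.85), n = 2000)
acf(y, plot = FALSE)$acf[2:4]   # roughly 0.85, 0.85^2, 0.85^3
pacf(y, plot = FALSE)$acf[1:3]  # only the first is clearly non-zero
```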
Autoregressive process: AR(1), AR(p)
As an example, we create (simulate) an AR(1) process with \(\phi_1=0.85\) and \(e\) drawn from the standard normal distribution (mean 0, variance 1).
Autoregressive process: AR(1), AR(p)
Now we create (simulate) an AR(2) process with \(\phi_1=0.5\), \(\phi_2=0.15\) and \(e\) drawn from the standard normal distribution (mean 0, variance 1).
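A sketch of how this simulation might be done with `arima.sim()` (assumed, since the original code is not shown):

```r
set.seed(1)  # arbitrary seed
# AR(2): y_t = 0.5 y_{t-1} + 0.15 y_{t-2} + e_t, e_t ~ N(0, 1)
y <- arima.sim(model = list(ar = c(0.5, 0.15)), n = 500)
plot(y, main = "AR(2): phi1 = 0.5, phi2 = 0.15")
```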
Partial correlation
Correlation between \(y_t\) and \(y_{t-2}\) is simply obtained by pairing both series of length \(n-2\) and computing their correlation
The lag-2 partial autocorrelation of \(y_t\) and \(y_{t-2}\), given the value in between, \(y_{t-1}\), is obtained by
computing residuals \(\hat{e}_t\) from regressing \(y_t\) on \(y_{t-1}\)
computing residuals \(\hat{e}_{t-2}\) from regressing \(y_{t-2}\) on \(y_{t-1}\)
computing the correlation between both residual series \(\hat{e}_t\) and \(\hat{e}_{t-2}\).
The lag-3 partial autocorrelation regresses \(y_t\) and \(y_{t-3}\) on both intermediate values \(y_{t-1}\) and \(y_{t-2}\)
etc…
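The steps above can be sketched directly (on an assumed AR(1) input with \(\phi_1 = 0.85\)); the result is close to what `pacf()` reports at lag 2:

```r
set.seed(1)  # arbitrary seed
y <- as.numeric(arima.sim(model = list(ar = 0.85), n = 1000))
n <- length(y)
y0 <- y[3:n]; y1 <- y[2:(n - 1)]; y2 <- y[1:(n - 2)]
e0 <- residuals(lm(y0 ~ y1))  # y_t regressed on y_{t-1}
e2 <- residuals(lm(y2 ~ y1))  # y_{t-2} regressed on y_{t-1}
cor(e0, e2)                   # lag-2 partial autocorrelation: near 0 for AR(1)
```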
Partial correlation
Partial correlation can help reveal what the order of an MA(q) or AR(p) series is:
Relation between AR and MA processes
Chatfield has more details about this. Substitute the AR(1) as follows
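A sketch of where this substitution leads (a standard derivation): replacing \(y_{t-1}\) by \(\phi_1 y_{t-2} + e_{t-1}\), then \(y_{t-2}\), and so on, gives for \(|\phi_1| < 1\) an MA(\(\infty\)) representation:

\[
y_t = \phi_1 y_{t-1} + e_t = \phi_1^2 y_{t-2} + \phi_1 e_{t-1} + e_t = \dots = \sum_{j=0}^{\infty} \phi_1^j e_{t-j}
\]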